ANTIMICROBIAL-PROTEIN-PRODUCING
ENDOSYMBIOTIC MICRO-ORGANISMS 

This invention relates to endosymbiotic 
micro-organisms having the ability to produce 
plant-derived antimicrobial proteins. 

In this context, 'antimicrobial' proteins are 
defined as proteins possessing at least one of the 
following activities: antifungal activity (which may 
include anti-yeast activity); antibacterial 
activity. Activity includes a range of antagonistic 
effects resulting in partial inhibition or death. 
'Plant-derived' proteins are capable of being 
isolated from the seed or other parts of one or more 
plant species. 

Various proteins with antimicrobial activity 
have been isolated from plant sources, and such 
proteins are often believed to take part in host 
defence mechanisms directed against invading or 
competing micro-organisms. Some of the proteins are 
well-characterised, and their amino acid sequence 
may be known. In some cases, the cDNA or gene 
encoding the protein has also been isolated and 
sequenced. 

To keep out potential invaders, plants produce 
a wide array of antifungal compounds, either in a 
constitutive or an inducible manner. Several 
classes of proteins with antifungal properties have 
now been identified, including chitinases, 
beta-1,3-glucanases, ribosome-inactivating proteins,
thionins, chitin-binding lectins and zeamatins.
These proteins have gained considerable attention as
they could potentially be used as biocontrol agents.
The chitinases (Schlumbaum et al, 1986, Nature, 324,
363-367) and beta-1,3-glucanases have weak
activities by themselves, and are only inhibitory to 
plant pathogens when applied in combination (Mauch 
et al, 1988, Plant Physiol, 88, 936-942). The 
chitin-binding lectins can also be classified as 
rather weak antifungal factors (Broekaert et al, 
1989, Science, 245, 1100-1102; Van Parijs et al, 
1991, Planta, 183, 258-264). Zeamatin is a more 
potent antifungal protein but its activity is 
strongly reduced by the presence of ions at 
physiological concentrations (Roberts and 
Selitrennikoff, 1990, J Gen Microbiol, 136,
2150-2155). Permatins are also known plant 
antifungal proteins (Vigers et al, 1991, Molec 
Plant-Microbe Interact, 4, 315-323; Woloshuk et al, 
1991, Plant Cell, 3, 619-628). Finally, thionins 
(Apel et al, 1990, Physiol Plant, 80, 315-321) and 
ribosome-inactivating proteins (Roberts and 
Selitrennikoff, 1986, Biosci Rep, 6, 19-29; Leah et
al, 1991, J Biol Chem, 266, 1564-1573) have 
antifungal activity and are known to be toxic for 
human cells (Carrasco et al, 1981, Eur J Biochem, 
116, 185-189; Vernon et al, 1985, Arch Biochem 
Biophys, 238, 18-29; Stirpe and Barbieri, 1986, FEBS 
Lett, 195, 1-8). 

Other groups of potent antimicrobial proteins 
with broad spectrum activity against plant 
pathogenic fungi (and often some antibacterial 
activity) are capable of isolation from certain 
plant species. We have previously described the 
structural and antifungal properties of several such 
proteins, including:

the small-sized cysteine-rich proteins Mj-AMP1
(antimicrobial protein 1) and Mj-AMP2 occurring in
seeds of Mirabilis jalapa (Cammue BPA et al, 1992, J
Biol Chem, 267:2228-2233; International Application
Publication Number WO92/15691 published on 17
September 1992); 

Ac-AMP1 and Ac-AMP2 from Amaranthus caudatus
seeds (Broekaert WF et al, 1992, Biochemistry,
31:4308-4314; International Application Publication
Number WO92/21699 published on 10 December 1992);

Ca-AMP1 from Capsicum annuum, Bm-AMP1 from
Briza maxima and related proteins found in other
plants including Delphinium, Catapodium, Baptisia
and Microsensis species (International Patent
Application Number PCT/GB93/02179 filed on 22 
October 1993); 

Rs-AFP1 (antifungal protein 1) and Rs-AFP2 from
seeds of Raphanus sativus (Terras FRG et al, 1992, J
Biol Chem, 267:15301-15309) and related proteins
such as Bn-AFP1 and Bn-AFP2 from Brassica napus,
Br-AFP1 and Br-AFP2 from Brassica rapa, Sa-AFP1 and
Sa-AFP2 from Sinapis alba, At-AFP1 from Arabidopsis
thaliana, Dm-AMP1 and Dm-AMP2 from Dahlia merckii,
Cb-AMP1 and Cb-AMP2 from Cnicus benedictus, Lc-AFP
from Lathyrus cicera, Ct-AMP1 and Ct-AMP2 from
Clitoria ternatea (International Patent Application
Publication Number WO93/05153 published 18 March
1993); 

Rs-nsLTP (non-specific lipid transfer protein) 
from Raphanus sativus (International Patent 
Application Publication Number WO93/05153 published 
18 March 1993).

These publications are specifically incorporated 
herein by reference. 

These and other plant-derived antimicrobial 
proteins are useful as fungicides or antibiotics to 
improve the disease-resistance or disease-tolerance 
of crops either during the life of the plant or for
post-harvest crop protection. The proteins may be 
extracted from plant tissue or produced by 
expression within micro-organisms. Exposure of a 
plant pathogen to an antimicrobial protein may be 
achieved by application of the protein to plant 
parts using standard agricultural techniques (eg 
surface spraying). The proteins may also be used to 
combat fungal or bacterial disease by expression 
within plant bodies (rather than just at the 
surface). DNA encoding the antimicrobial proteins 
(which may be a cDNA clone, a genomic DNA clone or 
DNA manufactured using a standard nucleic acid 
synthesiser) may be transformed into a plant, and 
the proteins expressed within transgenic plants. 

It is an object of the present invention to 
provide an alternative method to deliver the 
plant-derived antimicrobial protein to its desired 
site of action. Such a method should be generally 
applicable to a wide range of plant species and may 
be easier or more effective than other methods. 

Certain micro-organisms have the ability to 
enter into non-pathogenic endosymbiotic 
relationships with a plant host. These 
naturally-occurring micro-organisms, hereinafter 
called 'endophytes', are capable of infecting the
plant host and being harboured within the plant but 
create no visible manifestations of disease. Such 
organisms include mutualistic and commensalistic 
endophytic organisms. The range of endophytes also 
includes organisms which can exist in the vascular 
tissues of the plant and organisms which can exist 
within the intercellular spaces of the plant.

A method of endophyte-enhanced protection of
plants has been described in a series of patent
applications by Crop Genetics International
Corporation, which are discussed below and
incorporated specifically herein by reference. 

International Application Publication Number
WO90/13224 (published 15 November 1990) describes
the introduction of an endophytic bacterium into a 
commercially-valuable plant (such as tobacco, 
potato, muskmelon) to enhance protection against 
disease (such as tobacco mosaic virus (TMV),
Pseudomonas syringae pv. tabaci, Clavibacter
michiganense subsp. michiganense, potato virus X and
Y, Fusarium sp. and other vascular wilt fungi). The
endophyte is preferably Clavibacter xyli subsp.
cynodontis (Cxc). The endophyte may be introduced
into the plant by several methods including 
impregnating the seed with a suspension of the 
endophyte, using a seed coating, injecting the 
plant, and using a soil or foliar drench. 

The endophyte may be unmodified, genetically 
modified (as discussed below) or formulated with
other components to provide additional beneficial 
properties. 

The endophyte may be genetically modified to 
produce agricultural chemicals. In this case,
genetic material is derived from an agricultural- 
chemical-producing micro-organism and combined with 
a suitable endophyte. Combination of genetic 
material is achieved by: 

(a) forming a fusion hybrid between an endophytic 
bacterium and an agricultural-chemical- 
producing bacterium (European Patent 
Publication Number EP-125468-B1, published 28
October 1992); or
(b) the use of recombinant techniques (insertion of 
DNA encoding an agricultural chemical); for 
example, transforming the endophyte with an 
expression vector which directs production of 
an agricultural chemical (International 
Application Publication Number WO91/10363, 
published 24 July 1991 and International 
Application Publication Number WO87/03303, 
published 4 June 1987).

Use of the modified endophyte can improve the
disease tolerance of a plant host (when compared to
direct application of the agrochemical or
agrochemical-producing bacterium). The endophyte
may be further improved by additional genetic 
modification using natural or artificial techniques 
(such as mutagenesis). For example, the endophyte 
may be modified to excrete the agricultural chemical 
in a particular form. 

The source of DNA encoding the agricultural 
chemical is a suitable micro-organism. Such 
agricultural-chemical-producing micro-organisms are 
described in Table I (page 27) of International 
Application Publication Number WO91/10363 and 
include a wide variety of micro-organisms producing 
antibiotics, antifungal agents, antibacterial 
agents, antiviral agents, insecticides, nematocides, 
miticides, herbicides, fertilisers (nitrogen-fixing 
or phosphate solubilising agents), plant growth 
regulators or anti-feeding agents. 

Suitable endophytes include Agrobacterium
tumefaciens, Erwinia carotovora, Pseudomonas
solanacearum, Pseudomonas syringae, Xanthomonas
... Pseudomonas syringae for monocotyledonous plants.
Clavibacter xyli subsp. xyli and Clavibacter xyli
subsp. cynodontis (Cxc) are particularly useful for
grasses such as maize, sorghum and the like.

The agricultural-chemical-producing endophytes 
may be used to enhance disease protection in any
plant including those producing fruit, vegetables
and flowers, trees, field and row plants such as 
corn, sorghum, wheat, barley, oats, rice, brome 
grass, sugar cane, cotton, potatoes, tomatoes, 
cabbage, cauliflower, broccoli, melons, cucumbers, 

International Application Publication Number
WO88/09114 (published 1 December 1988) describes
plants colonised by beneficial endophytic 
micro-organisms obtained by germination of seeds 
impregnated with the endophytes. The endophyte may
be a strain of the genus Clavibacter or Rhizobium,
and may be genetically modified to produce an
agricultural chemical. The seed may be from the
Gramineae, Leguminosae or Malvaceae family.

International Application Publication Number
WO91/1 907 (published 22 August 1991) describes the
production of modified seed (particularly rice)
containing an unmodified or modified endophyte
(particularly Cxc) to produce a plant of reduced
stature.

Crop Genetics International have already
developed a corn bioinsecticide based upon this
endophyte technology (trademark: INCIDE Technology).
The INCIDE bioinsecticide consists of the endophyte
Clavibacter xyli subsp. cynodontis (Cxc) which has
been genetically modified with an endotoxin gene
derived from the bacterium Bacillus thuringiensis,
and thus expresses a protein which is toxic to
certain insect larvae. If corn seed is inoculated
with the INCIDE vaccine, the modified Cxc inhabits
the vascular tissue of plants grown from this seed
and the crop is protected from attack by corn borer
larvae. However, there may be an associated yield
reduction in certain crop species or varieties
(Agrow, 13/11/92, no 172, p 6).

European Patent Application Publication Number 
185005 (Monsanto Co, published 18 June 1986) also 
describes a "plant-colonizing micro-organism" 
(herein called an endophyte) which has been 
genetically modified to express a B thuringiensis
protein.

When using an agricultural-chemical-producing 
endophyte to enhance disease protection in a plant, 
the source of DNA encoding the agricultural chemical 
is a suitable micro-organism. Plant-derived DNA 
sequences encoding antimicrobial proteins have not 
previously been used to modify the endophytes. 

To improve disease-resistance or
disease-tolerance of crops, plant-derived
antimicrobial proteins may be produced within the
crop plant by expression of a gene incorporated into
the plant genome. This may involve over-expression 
of an inherent protein or expression of a protein 
derived from another plant species. We now provide 
the means to express the antimicrobial protein 
within the crop plant without requiring plant 
transformation.

According to the invention, there is provided
a method of producing antimicrobial-protein- 
producing micro-organisms capable of entering into 
endosymbiotic relationships with a plant host 
comprising the combination of genetic material 
encoding a plant-derived antimicrobial protein with 
an endophyte.

There are further provided antimicrobial-
protein-producing micro-organisms produced according
to the method of the invention, and seed and plants 
treated with said micro-organisms. Antimicrobial 
protein may thus be expressed within the plant by an 
endophyte rather than being directly expressed by 
the host crop plant. 

As noted above, use of a genetically modified 
endophyte to deliver an agricultural chemical 
(including antifungal agents) has been described. 
However, the agricultural chemical was expressed 
from a gene derived from another micro-organism 
(usually a bacterium). Genes encoding plant-derived
antimicrobial proteins have not been previously used 
(or suggested) to modify the endophyte. 

Examples of plants which may be protected using 
the antimicrobial-protein-producing micro-organisms 
include field crops, cereals, fruit and vegetables 
such as: canola, oil seed rape, sunflower, tobacco, 
sugarbeet, cotton, soya, maize, wheat, barley, rice, 
sorghum, tomatoes, mangoes, peaches, apples, pears, 
strawberries, bananas, melons, potatoes, carrot, 
lettuce, cabbage, onion. 



DNA encoding any plant-derived antimicrobial 
protein may be used in the method according to the
invention (for example, DNA encoding chitinases,
hevein, lectins, thionins, etc).

By way of example only, DNA encoding the 
following plant-derived antimicrobial proteins may 
be used in the method according to the invention: 
Mj-AMP1, Mj-AMP2, Ac-AMP1, Ac-AMP2, Ca-AMP1,
Bm-AMP1, Rs-AFP1, Rs-AFP2, Br-AFP1, Br-AFP2,
Bn-AFP1, Bn-AFP2, Sa-AFP1, Sa-AFP2, At-AFP1,
Dm-AMP1, Dm-AMP2, Cb-AMP1, Cb-AMP2, Lc-AFP, Ct-AMP1,
Ct-AMP2, Rs-nsLTP. These proteins show a high level
and wide spectrum of antifungal activity, and will
be particularly useful for improving
disease-resistance or disease-tolerance in crops. 
In particular, one or more of these potent 
antimicrobial proteins may be used in conjunction 
with a slower-growing endophyte as a relatively low 
dose of the highly active protein may be needed to 
provide disease protection. The presence of a 
slower-growing endophyte may result in less 
diversion of the host plant's metabolic resources, 
maintaining crop yield. In addition, use of these 
potent plant-derived antimicrobial proteins may 
extend the range of plant hosts most suitable as 
targets for this type of disease protection. Even 
endophytes which are relatively poor colonisers of 
certain plant species (such as Cxc on wheat) may be 
engineered to express one or more of the potent 
proteins to give the desired level of protection to 
the host plant.

The invention will now be described by way of 
example only, with reference to the Sequence Listing 
in which: 

SEQ ID NO: 1 is the amino acid sequence of Mj-AMP1.

SEQ ID NO: 2 is the amino acid sequence of Mj-AMP2.

SEQ ID NO: 3 is the nucleotide sequence of Mj-AMP1.

SEQ ID NO: 4 is the amino acid sequence of Mj-AMP1
deduced from SEQ ID NO: 3.

SEQ ID NO: 5 is the nucleotide sequence of Mj-AMP2.

SEQ ID NO: 6 is the amino acid sequence of Mj-AMP2
deduced from SEQ ID NO: 5.

SEQ ID NO: 7 is the amino acid sequence of Ac-AMP1.

SEQ ID NO: 8 is the amino acid sequence of Ac-AMP2.

SEQ ID NO: 9 is the nucleotide sequence of Ac-AMP2.

SEQ ID NO: 10 is the amino acid sequence of Ac-AMP2
deduced from SEQ ID NO: 9.

SEQ ID NO: 11 is the amino acid sequence of Ca-AMP1.

SEQ ID NO: 12 is one possible predicted DNA
sequence for the Ca-AMP1 gene.

SEQ ID NO: 13 is the amino acid sequence of Bm-AMP1.

SEQ ID NO: 14 is one possible predicted DNA
sequence for the Bm-AMP1 gene.

SEQ ID NO: 15 is the amino acid sequence of Rs-AFP1.

SEQ ID NO: 16 is the amino acid sequence of Rs-AFP2.

SEQ ID NO: 17 is the amino acid sequence of Br-AFP1.

SEQ ID NO: 18 is the amino acid sequence of Br-AFP2.

SEQ ID NO: 19 is the amino acid sequence of Bn-AFP1.

SEQ ID NO: 20 is the amino acid sequence of Bn-AFP2.

SEQ ID NO: 21 is the amino acid sequence of Sa-AFP1.

SEQ ID NO: 22 is the amino acid sequence of Sa-AFP2.

SEQ ID NO: 23 is the amino acid sequence of At-AFP1.

SEQ ID NO: 24 is the amino acid sequence of Dm-AMP1.

SEQ ID NO: 25 is the amino acid sequence of Dm-AMP2.

SEQ ID NO: 26 is the amino acid sequence of Cb-AMP1.

SEQ ID NO: 27 is the amino acid sequence of Cb-AMP2.

SEQ ID NO: 28 is the amino acid sequence of Lc-AFP.

SEQ ID NO: 29 is the amino acid sequence of Ct-AMP1.

SEQ ID NO: 30 is the amino acid sequence of Rs-nsLTP.

SEQ ID NO: 31 is one possible predicted DNA
sequence for the Dm-AMP1 gene.

SEQ ID NO: 32 is one possible predicted DNA
sequence for the Dm-AMP2 gene.

SEQ ID NO: 33 is one possible predicted DNA
sequence for the Cb-AMP1 gene.

SEQ ID NO: 34 is one possible predicted DNA
sequence for the Cb-AMP2 gene.

SEQ ID NO: 35 is one possible predicted DNA
sequence for the Lc-AFP gene.

SEQ ID NO: 36 is one possible predicted DNA
sequence for the Ct-AMP1 gene.

SEQ ID NO: 37 is the full length cDNA sequence
of Rs-AFP1.

SEQ ID NO: 38 is the amino acid sequence of
Rs-AFP1 deduced from SEQ ID NO: 37.

SEQ ID NO: 39 is the truncated cDNA sequence of
Rs-AFP2.

SEQ ID NO: 40 is the amino acid sequence of
Rs-AFP2 deduced from SEQ ID NO: 39.

SEQ ID NO: 41 is the full length DNA sequence of
PCR assisted site directed mutagenesis of Rs-AFP2.

SEQ ID NO: 42 is the amino acid sequence of
Rs-AFP2 deduced from SEQ ID NO: 41.



EXAMPLE 1 

Expression of Raphanus sativus Antifungal
Protein 2 (Rs-AFP2) by the endophyte Clavibacter
xyli subsp. cynodontis (Cxc).

The Rs-AFP2 protein is expressed in a system
analogous to that which is known to express the
Bacillus thuringiensis endotoxin. An
oligonucleotide sequence coding for the antifungal 
protein Rs-AFP2 is prepared using Cxc-compatible 
codons. This oligonucleotide sequence comprises 
appropriate restriction sites to enable it to be 
exchanged with the Bacillus thuringiensis endotoxin 
gene sequence present in the INCIDE Cxc bacterium. 

Southern analysis is used to check that Cxc is 
transformed with the Rs-AFP2 gene. If the result 
is positive, the bacterium is cultured to determine 
whether it is capable of expressing Rs-AFP2 protein 
in vitro. Western analysis and antifungal assays
are carried out on the fermentation products to
determine whether the protein is produced in the
correctly folded form as found in the native plant. 
It is known that the protein loses antifungal 
activity when it is reduced and hence unfolded. 
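
The codon-adaptation step described in this example can be illustrated with a short back-translation sketch. It is an illustration under stated assumptions only: no Cxc codon-usage table is given here, so the sketch simply prefers the most G+C-rich synonymous codon as a hypothetical stand-in for "Cxc-compatible codons"; the names `PREFERRED`, `back_translate` and `translate` are invented for the illustration; the input is the 36-residue Rs-AFP2 sequence listed under SEQ ID NO: 16; and the flanking restriction sites are omitted because they are not named.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def gc_count(codon):
    """Number of G or C bases in a codon."""
    return sum(base in "GC" for base in codon)


# ASSUMPTION: the most G+C-rich synonymous codon is used as a placeholder
# preference; a real design would use a measured Cxc codon-usage table.
PREFERRED = {}
for codon, aa in CODON_TABLE.items():
    if aa == "*":
        continue
    best = PREFERRED.get(aa)
    if best is None or (gc_count(codon), codon) > (gc_count(best), best):
        PREFERRED[aa] = codon


def back_translate(protein):
    """Return one possible coding sequence for a one-letter protein string."""
    return "".join(PREFERRED[aa] for aa in protein)


def translate(dna):
    """Translate a coding sequence in frame 1 (no stop handling)."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


# Residues 1-36 of Rs-AFP2 as listed under SEQ ID NO: 16 (one-letter code).
RS_AFP2 = "QKLCQRPSGTWSGVCG" + "NNNACKNQCIRLEKAR" + "HGSC"

designed = back_translate(RS_AFP2)
print(designed)
print(translate(designed) == RS_AFP2)
```

Translating the designed sequence back to protein is a cheap consistency check before an oligonucleotide is ordered; it confirms only that the coding sequence is self-consistent, not that the protein will be expressed or correctly folded in Cxc.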



EXAMPLE 2

Protection of rice plants using
Rs-AFP2-producing Cxc as an antifungal agent.

Cultures of Cxc which are capable of
expressing Rs-AFP2 protein are used to treat rice
plants by a soil drench or seed treatment method.

The rice plants are challenged with rice
blast, Pyricularia oryzae, and assessed for
increased resistance to the pathogen over non-Cxc-
infected plants. Rs-AFP2 is known to be active
against P oryzae in in vitro tests.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: ZENECA, Limited 

(ii) TITLE OF INVENTION: ANTIMICROBIAL-PROTEIN-PRODUCING
ENDOSYMBIOTIC MICRO-ORGANISMS 

(iii) NUMBER OF SEQUENCES: 42 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: ICI GROUP PATENTS SERVICES DEPT

(B) STREET: PO BOX 6, SHIRE PARK, BESSEMER ROAD, 

(C) CITY: WELWYN GARDEN CITY 

(D) STATE: HERTFORDSHIRE 

(E) COUNTRY: UNITED KINGDOM 

(F) ZIP: AL7 1HD

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: GB 9300281.4 

(B) FILING DATE: 08-JAN-1993 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: ROBERTS, TIMOTHY W 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 44 707 323400 

(B) TELEFAX: 44 707 337454

(C) TELEX: 94028500 ICIC G



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



SUBSTITUTE SHEET 
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Gln Cys Ile Gly Asn Gly Gly Arg Cys Asn Glu Asn Val Gly Pro Pro
1 5 10 15

Tyr Cys Cys Ser Gly Phe Cys Leu Arg Gln Pro Gly Gln Gly Tyr Gly
20 25 30

Tyr Cys Lys Asn Arg
35

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Cys Ile Gly Asn Gly Gly Arg Cys Asn Glu Asn Val Gly Pro Pro Tyr
1 5 10 15

Cys Cys Ser Gly Phe Cys Leu Arg Gln Pro Asn Gln Gly Tyr Gly Val
20 25 30 

Cys Arg Asn Arg 
35 

(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 360 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

CTTCCCGTTG CCTTCCTCAA ATTCGCTATT GTGTTGATTC TCTTCATTGC CATGTCCGCA 60

ATGATAGAAG CACAATGCAT AGGAAATGGA GGAAGATGTA ACGAGAACGT GGGGCCACCA 120

TACTGCTGCT CCGGTTTCTG CCTCCGTCAA CCTGGACAAG GTTATGGATA TTGTAAGAAC 180 

CGCTGAGCAA GAGCATGAAA GCAAGGCCAA TGTGTGGTCT ACTAATTTAG CCTCAAATGT 240 

TATTTATTTG CATGTCTTGT GTTTCTTAAT TACCTTCTTT GTGTCTAAGA AGGTATAGAT 300 

CAATAGTTTC TACTTTACTA CTATGAATAA GAGGCTTTGA TTTGGTTTAA AAAAAAAAAA 360 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 61 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Leu Pro Val Ala Phe Leu Lys Phe Ala Ile Val Leu Ile Leu Phe Ile
1 5 10 15

Ala Met Ser Ala Met Ile Glu Ala Gln Cys Ile Gly Asn Gly Gly Arg
20 25 30

Cys Asn Glu Asn Val Gly Pro Pro Tyr Cys Cys Ser Gly Phe Cys Leu
35 40 45

Arg Gln Pro Gly Gln Gly Tyr Gly Tyr Cys Lys Asn Arg
50 55 60 
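
The relationship between SEQ ID NO: 3 and SEQ ID NO: 4 ("deduced from") can be checked with a minimal translation sketch. It assumes the standard genetic code and reading frame 1 starting at the first listed nucleotide; the function name `translate_to_stop` is invented for this illustration, and only the first 240 nucleotides of SEQ ID NO: 3 are included because the open reading frame ends before that point.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# First 240 nucleotides of SEQ ID NO: 3, copied in blocks of ten.
SEQ_ID_NO_3_START = (
    "CTTCCCGTTG CCTTCCTCAA ATTCGCTATT GTGTTGATTC TCTTCATTGC CATGTCCGCA"
    "ATGATAGAAG CACAATGCAT AGGAAATGGA GGAAGATGTA ACGAGAACGT GGGGCCACCA"
    "TACTGCTGCT CCGGTTTCTG CCTCCGTCAA CCTGGACAAG GTTATGGATA TTGTAAGAAC"
    "CGCTGAGCAA GAGCATGAAA GCAAGGCCAA TGTGTGGTCT ACTAATTTAG CCTCAAATGT"
).replace(" ", "")


def translate_to_stop(dna):
    """Translate in frame 1 and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


deduced = translate_to_stop(SEQ_ID_NO_3_START)
print(deduced)
print(len(deduced))
```

The translation runs for 61 codons before reaching a TGA stop codon, matching the stated length of SEQ ID NO: 4, and its last 37 residues are the Mj-AMP1 sequence of SEQ ID NO: 1.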

(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 433 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
ATATCATTCA AATATACTAA ACTAATTATA AAAAATGGCT AAGGTTCCAA TTGCCTTTCT 60

CAAATTCGTC ATCGTGTTGA TTCTCTTCAT TGCCATGTCA GGCATGATAG AAGCATGCAT 120

AGGAAATGGA GGAAGATGTA ACGAGAACGT GGGCCCACCA TACTGCTGTT CGGGTTTCTG 180

CCTCCGTCAA CCTAACCAAG GTTACGGTGT TTGCAGGAAC CGCTAATAAG CAAAGCCCAA 240 

AGTGTGGGTC ACAAAATAGT AGAGTTTAGC CTCAAATGTG GTTTATATAT GTAACAATCT 300 

TATATGTGTT TCTCTTGTGT TTCTTAATTA CCTTCTTTGT GTCTAAGAAG GTATGGATAA 360 

ATAGTTTGTA CTTTACTATT ATGGTTTTTT CTTATATCAA TAAGAGGCTT TAATTAAAAA 420 

AAAAAAAAAA AAA 433

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 63 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Ala Lys Val Pro Ile Ala Phe Leu Lys Phe Val Ile Val Leu Ile
1 5 10 15

Leu Phe Ile Ala Met Ser Gly Met Ile Glu Ala Cys Ile Gly Asn Gly
20 25 30

Gly Arg Cys Asn Glu Asn Val Gly Pro Pro Tyr Cys Cys Ser Gly Phe
35 40 45

Cys Leu Arg Gln Pro Asn Gln Gly Tyr Gly Val Cys Arg Asn Arg
50 55 60 
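
The deduced sequence of SEQ ID NO: 6 is longer than the Mj-AMP2 sequence of SEQ ID NO: 2, and the difference can be located with a small comparison sketch. It assumes only the two listed sequences; the helper `to_one_letter` and the variable names are invented for this illustration, and the N-terminal extension is described neutrally because its function is not stated in this listing.

```python
# Three-letter to one-letter amino acid codes (the 20 standard residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(listing):
    """Convert whitespace-separated three-letter codes to a one-letter string."""
    return "".join(THREE_TO_ONE[residue] for residue in listing.split())


# SEQ ID NO: 2 (Mj-AMP2, 36 residues).
MATURE = """
Cys Ile Gly Asn Gly Gly Arg Cys Asn Glu Asn Val Gly Pro Pro Tyr
Cys Cys Ser Gly Phe Cys Leu Arg Gln Pro Asn Gln Gly Tyr Gly Val
Cys Arg Asn Arg
"""

# SEQ ID NO: 6 (deduced from the cDNA of SEQ ID NO: 5, 63 residues).
PRECURSOR = """
Met Ala Lys Val Pro Ile Ala Phe Leu Lys Phe Val Ile Val Leu Ile
Leu Phe Ile Ala Met Ser Gly Met Ile Glu Ala Cys Ile Gly Asn Gly
Gly Arg Cys Asn Glu Asn Val Gly Pro Pro Tyr Cys Cys Ser Gly Phe
Cys Leu Arg Gln Pro Asn Gln Gly Tyr Gly Val Cys Arg Asn Arg
"""

mature = to_one_letter(MATURE)
precursor = to_one_letter(PRECURSOR)

# Offset of the Mj-AMP2 sequence within the deduced sequence, which is
# also the length of the N-terminal extension.
leader_length = precursor.find(mature)

print(len(precursor), len(mature), leader_length)
print(mature.count("C"))
```

On these two listings the 36-residue Mj-AMP2 sequence occupies the C-terminal end of the 63-residue deduced sequence, leaving a 27-residue N-terminal extension, and the 36-residue sequence contains six cysteines, consistent with the description of these proteins as cysteine-rich.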

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Val Gly Glu Cys Val Arg Gly Arg Cys Pro Ser Gly Met Cys Cys Ser
1 5 10 15

Gln Phe Gly Tyr Cys Gly Lys Gly Pro Lys Tyr Cys Gly
20 25 

(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Val Gly Glu Cys Val Arg Gly Arg Cys Pro Ser Gly Met Cys Cys Ser 
1 5 10 15 

Gln Phe Gly Tyr Cys Gly Lys Gly Pro Lys Tyr Cys Gly Arg
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 590 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

CAAAAAAAAA AAATAAAGTC AAGAGTATTA ATTAGGTGAG AAAAAATGGT GAACATGAAG 60 

TGTGTTGCAT TGATAGTTAT AGTTATGATG GCGTTTATGA TGGTGGATCC ATCAATGGGA 120 

GTGGGAGAAT GTGTGAGAGG ACGTTGCCCA AGTGGGATGT GTTGCAGTCA GTTTGGGTAC 180 

TGTGGTAAAG GCCCAAAGTA CTGTGGCCGT GCCAGTACTA CTGTGGATCA CCAAGCTGAT 240 

GTTGCTGCCA CCAAAACTGC CAAGAATCCT ACCGATGCTA AACTTGCTGG TGCTGGTAGT 300 

CCATGAAAGT AGTAGCTAGC TAGGTTCACG TTGGATTACC AAGCCGTGCC AGTACTACTG 360 

TGGCCGTGCC AGTACTAATG TTCTCTTATA TGTCTGAAAT AAGCTCCTAT ATAAATACTA 420 

GTATCTTGAT GTAATGGAGT ATTTTCATTT TGTTTTTATT TGAGTTATGA TCGTGACTTC 480

CTTGTGTTGG TTTAACTTGT ATATTGTAAT GCATCTTAAA TGCTGTCTCA AATAATTTGA 540

TGTATTAAAC ACTTGTTTTG TTTTTAATAC ATACTAAGTG CTGTAAATTC 590

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 86 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Val Asn Met Lys Cys Val Ala Leu Ile Val Ile Val Met Met Ala
1 5 10 15

Phe Met Met Val Asp Pro Ser Met Gly Val Gly Glu Cys Val Arg Gly
20 25 30

Arg Cys Pro Ser Gly Met Cys Cys Ser Gln Phe Gly Tyr Cys Gly Lys
35 40 45

Gly Pro Lys Tyr Cys Gly Arg Ala Ser Thr Thr Val Asp His Gln Ala
50 55 60 

Asp Val Ala Ala Thr Lys Thr Ala Lys Asn Pro Thr Asp Ala Lys Leu 
65 70 75 80 

Ala Gly Ala Gly Ser Pro 
85 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Gln Glu Gln Cys Gly Asn Gln Ala Gly Gly Arg Ala Cys Ala Asn Arg
1 5 10 15

Leu Cys Cys Ser Gln Tyr Gly Tyr Cys Gly Ser Thr Arg Ala Tyr Cys
20 25 30

Gly Val Gly Cys Gln Ser Asn Cys Gly Arg
35 40 

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 126 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
CAAGAGCAAT GCGGAAACCA AGCTGGAGGA AGAGCTTGCG CTAACAGACT TTGCTGCTCT 60 
CAATACGGAT ACTGCGGATC TACTAGAGCT TACTGCGGAG TTGGATGCCA ATCTAACTGC 120 
GGAAGA 126 
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site

(B) LOCATION: 15 

(D) OTHER INFORMATION: /note= "Xaa at position 15 may be R 
or H" 

(ix) FEATURE: 

(A) NAME/KEY: Modified-site

(B) LOCATION: 29 

(D) OTHER INFORMATION: /note= "Xaa at position 29 may be S 
or N" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Cys Ser Ser His Asn Pro Cys Pro Arg His Gln Cys Cys Ser Xaa Tyr
1 5 10 15

Gly Tyr Cys Gly Leu Gly Ser Asp Tyr Cys Gly Leu Xaa Cys Arg Gly
20 25 30 

Gly Pro Cys Asp Arg 
35 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 111 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
TGCTCTTCTC ACAACCCGTG CCCGAGACAC CAATGCTGCT CTAAGTACGG ATACTGCGGA 60 
CTTGGATCTG ACTACTGCGG ACTTGGATGC AGAGGAGGAC CGTGCGACAG A 111 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

Gln Lys Leu Cys Glu Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly
1 5 10 15

Asn Asn Asn Ala Cys Lys Asn Gln Cys Ile Asn Leu Glu Lys Ala Arg
20 25 30 

His Gly Ser Cys Asn Tyr Val Phe Pro Ala His Lys 
35 40 

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 



WO 94/16076 PCT/GB94/00012

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Gln Lys Leu Cys Gln Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly
1 5 10 15

Asn Asn Asn Ala Cys Lys Asn Gln Cys Ile Arg Leu Glu Lys Ala Arg
20 25 30 

His Gly Ser Cys 
35 

(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

Gln Lys Leu Cys Glu Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly
1 5 10 15

Asn Asn Asn Ala Cys Lys Asn Gln Cys Ile Asn
20 25 

(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 

Gln Lys Leu Cys Glu Arg Pro Ser Gly Thr Xaa Ser Gly Val Cys Gly
1 5 10 15

Asn Asn Asn Ala Cys Lys Asn Gln Cys Ile Arg
20 25 

(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 

Gln Lys Leu Cys Glu Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly
1 5 10 15

Asn Asn Asn Ala Cys Lys Asn Gln Cys Ile Asn Leu Glu Lys
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

Gln Lys Leu Cys Glu Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly
1 5 10 15

Asn Asn Asn Ala Cys Lys Asn 
20 

(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

Gln Lys Leu Cys Glu Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly
1 5 10 15

Asn Asn Asn Ala Cys Lys Asn Gln Cys
20 25

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Gln Lys Leu Cys Gln Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly
1 5 10 15

Asn Asn Asn Ala Cys Arg Asn Gln Cys Ile
20 25 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

Gln Lys Leu Cys Glu Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly
1 5 10 15

Asn Ser Asn Ala Cys Lys Asn Gln Cys Ile Asn
20 25 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24: 

Glu Leu Cys Glu Lys Ala Ser Lys Thr Trp Ser Gly Asn Cys Gly Asn
1 5 10 15

Thr Gly His Cys Asp Asn Gln Cys Lys Ser Trp Glu Gly Ala Ala His
20 25 30 

Gly Ala Cys His Val Arg Asn Gly Lys His Met Cys Phe Cys Tyr Phe 
35 40 45 

Asn Cys 
50 

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

Glu Val Cys Glu Lys Ala Ser Lys Thr Trp Ser Gly Asn Cys Gly Asn 
1 5 10 15

Thr Gly His Cys 
20 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 






Glu Leu Cys Glu Lys Ala Ser Lys Thr Trp Ser Gly Asn Cys Gly Asn 
1 5 10 15 

Thr Lys His Cys Asp Asp Gln Cys Lys Ser Trp Glu Gly Ala Ala His
20 25 30 

Gly Ala Cys His Val Arg Asn Gly Lys His Met Cys Phe Cys Tyr Phe 
35 40 45 

Asn Cys 
50 

(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 50 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

Glu Leu Cys Glu Lys Ala Ser Lys Thr Trp Ser Gly Asn Cys Gly Asn
1 5 10 15 

Thr Lys His Cys Asp Asn Lys Cys Lys Ser Trp Glu Gly Ala Ala His 
20 25 30 

Gly Ala Cys His Val Arg Ser Gly Lys His Met Cys Phe Cys Tyr Phe 
35 40 45 

Asn Cys 
50 

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 

Lys Thr Cys Glu Asn Leu Ser Gly Thr Phe Lys Gly Pro Cys Ile Pro
1 5 10 15

Asp Gly Asn Cys Asn Lys His Cys Lys Asn Asn Glu His Leu Leu Ser
20 25 30

Gly Arg Cys Arg Asp Asp Phe Xaa Cys Trp Cys Thr Arg Asn Cys
35 40 45

(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 49 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Asn Leu Cys Glu Arg Ala Ser Leu Thr Trp Thr Gly Asn Cys Gly Asn 
1 5 10 15

Thr Gly His Cys Asp Thr Gln Cys Arg Asn Trp Glu Ser Ala Lys His
20 25 30 

Gly Ala Cys His Lys Arg Gly Asn Trp Lys Cys Phe Cys Tyr Phe Asp 
35 40 45 

Cys 



(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

Ala Leu Ser Cys Gly Thr Val Asn Ser Asn Leu Ala Ala Cys Ile Gly
1 5 10 15

Tyr Leu Thr Gln Asn Ala Pro Leu Ala Arg Gly Cys Cys Thr Gly Val
20 25 30

Thr Asn Leu Asn Asn Met Ala Xaa Thr Thr Pro
35 40 

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 150 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
GAGCTTTGCG AGAAGGCTTC TAAGACTTGG TCTGGAAACT GCGGAAACAC TGGACATTGC 60
GATAACCAAT GCAAGTCTTG GGAGGGAGCT GCTCATGGAG CTTGCCATGT TAGAAACGGA 120
AAGCATATGT GCTTCTGCTA CTTCAACTGC 150
(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 60 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
GAGGTTTGCG AGAAGGCTTC TAAGACTTGG TCTGGAAACT GCGGAAACAC TGGACATTGC 60



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 150 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
GAGCTTTGCG AGAAGGCTTC TAAGACTTGG TCTGGAAACT GCGGAAACAC TAAGCATTGC 60
GATGATCAAT GCAAGTCTTG GGAGGGAGCT GCTCATGGAG CTTGCCATGT TAGAAAGGGA 120
AAGCATATGT GCTTCTGCTA CTTCAACTGC 150
(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 150 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 
GAGCTTTGCG AGAAGGCTTC TAAGACTTGG TCTGGAAACT GCGGAAACAC TAAGCATTGC 60 
GATAACAAGT GCAAGTCTTG GGAGGGAGCT GCTCATGGAG CTTGCCATGT TAGATCTGGA 120 
AAGCATATGT GCTTCTGCTA CTTCAACTGC 150 
(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 141 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
AAGACTTGCG AGAACCTTTC TGGAACTTTC AAGGGACCAT GCATTCCAGA TGGAAACTGC 60 
AACAAGCATT GCAAGAACAA CGAGCATCTT CTTTCTGGAA GATGCAGAGA TGATTTCNNN 120 
TGCTGGTGCA CTAGAAACTG C 141
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 147 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
AACCTTTGCG AGAGAGCTTC TCTTACTTGG ACTGGAAACT GCGGAAACAC TGGACATTGC 60 
GATACTCAAT GCAGAAACTG GGAGTCTGCT AAGCATGGAG CTTGCCATAA GAGAGGAAAC 120 
TGGAAGTGCT TCTGCTACTT CGATTGC 147 
(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 414 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 



GTTTTATTAG TGATCATGGC TAAGTTTGCG TCCATCATCG CACTTCTTTT TGCTGCTCTT 60
GTTCTTTTTG CTGCTTTCGA AGCACCAACA ATGGTGGAAG CACAGAAGTT GTGCGAAAGG 120
CCAAGTGGGA CATGGTCAGG AGTCTGTGGA AACAATAACG CATGCAAGAA TCAGTGCATT 180
AACCTTGAGA AAGCACGACA TGGATCTTGC AACTATGTCT TCCCAGCTCA CAAGTGTATC 240
TGCTACTTTC CTTGTTAATT TATCGCAAAC TCTTTGGTGA ATAGTTTTTA TGTAATTTAC 300
ACAAAATAAG TCAGTGTCAC TATCCATGAG TGATTTTAAG ACATGTACCA GATATGTTAT 360
GTTGGTTCGG TTATACAAAT AAAGTTTTAT TCACCAAAAA AAAAAAAAAA AAAA 414


(2) INFORMATION FOR SEQ ID NO: 38: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 80 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 






Met Ala Lys Phe Ala Ser Ile Ile Ala Leu Leu Phe Ala Ala Leu Val
1 5 10 15

Leu Phe Ala Ala Phe Glu Ala Pro Thr Met Val Glu Ala Gln Lys Leu
20 25 30

Cys Glu Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly Asn Asn Asn
35 40 45

Ala Cys Lys Asn Gln Cys Ile Asn Leu Glu Lys Ala Arg His Gly Ser
50 55 60

Cys Asn Tyr Val Phe Pro Ala His Lys Cys Ile Cys Tyr Phe Pro Cys
65 70 75 80



(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 284 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 

GGAAATAATA ACGCATGCAA GAATCAGTGC ATTCGACTTG AGAAAGCACG ACATGGGTCT 60 

TGCAACTATG TCTTCCCAGC TCACAAGTGT ATCTGTTATT TCCCTTGTTA ATTCCATAAA 120 

CTCTTCGGTG GTTAATAGTG TGCGCATATT ACATATAATT AATAAGTTTG TGTCACTATT 180 

TATTAGTGAC TTTATGACAT GTGCCAGGTA TGTTTATGTT GGGTTGGTTG TAATATAAAA 240 

AAGTTCACGG ATAATAAGAT GATAAGCTCA CGTCGCCAAA AAAA 284 
(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Gly Asn Asn Asn Ala Cys Lys Asn Gln Cys Ile Arg Leu Glu Lys Ala
1 5 10 15

Arg His Gly Ser Cys Asn Tyr Val Phe Pro Ala His Lys Cys Ile Cys
20 25 30

Tyr Phe Pro Cys 
35 

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 288 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

CCCCGGGCTG CAGGAATTCG CGGCCGCGTT TTATTAGTGA TCATGGCTAA GTTTGCGTCC 60 

ATCATCGCAC TTCTTTTTGC TGCTCTTGTT CTTTTTGCTG CTTTCGAAGC ACCAACAATG 120 

GTGGAAGCAC AGAAGTTGTG CCAAAGGCCA AGTGGGACAT GGTCAGGAGT CTGTGGAAAC 180 

AATAACGCAT GCAAGAATCA GTGCATTAGA CTTGAGAAAG CACGACATGG ATCTTGCAAC 240 
TATGTCTTCC CAGCTCACAA GTGTATCTGC TACTTTCCTT GTTAATAG 288
(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 80 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 

Met Ala Lys Phe Ala Ser Ile Ile Ala Leu Leu Phe Ala Ala Leu Val
1 5 10 15

Leu Phe Ala Ala Phe Glu Ala Pro Thr Met Val Glu Ala Gln Lys Leu
20 25 30

Cys Gln Arg Pro Ser Gly Thr Trp Ser Gly Val Cys Gly Asn Asn Asn
35 40 45

Ala Cys Lys Asn Gln Cys Ile Arg Leu Glu Lys Ala Arg His Gly Ser
50 55 60

Cys Asn Tyr Val Phe Pro Ala His Lys Cys Ile Cys Tyr Phe Pro Cys
65 70 75 80

We claim:

1. A method of producing an
antimicrobial-protein-producing micro-organism
capable of entering into an endosymbiotic
relationship with a plant host, comprising the
combination of genetic material encoding a
plant-derived antimicrobial protein with an
endophyte.

2. A method according to claim 1 in which the
plant-derived antimicrobial protein is
selected from the protein group consisting of
Mj-AMP1, Mj-AMP2, Ac-AMP1, Ac-AMP2, Ca-AMP1,
Bm-AMP1, Rs-AFP1, Rs-AFP2, Br-AFP1, Br-AFP2,
Bn-AFP1, Bn-AFP2, Sa-AFP1, Sa-AFP2, At-AFP1,
Dm-AMP1, Dm-AMP2, Cb-AMP1, Cb-AMP2, Lc-AFP,
Ct-AMP1, Ct-AMP2 and Rs-nsLTP.

3. A method according to claim 1 in which the
endophyte is Clavibacter xyli subsp.
cynodontis.

4. An antimicrobial-protein-producing
micro-organism produced by the method
according to claim 1.

5. A method for protecting a plant host from
disease comprising treating the plant host
with the antimicrobial-protein-producing
micro-organism according to claim 4.



6. A plant or seed treated with an
antimicrobial-protein-producing micro-organism
according to claim 4.



INTERNATIONAL SEARCH REPORT

International Application No

PCT/GB 94/00012

A. CLASSIFICATION OF SUBJECT MATTER

IPC 5 C12N15/29 C12N15/74 A01N63/00 C12N1/21 A01H5/00
A01H5/10

According to International Patent Classification (IPC) or to both national classification and IPC



B. FIELDS SEARCHED 



Minimum documentation searched (classification system followed by classification symbols)

IPC 5 C12N A01N A01H

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched

Electronic data base consulted during the international search (name of data base and, where practical, search terms used)

C. DOCUMENTS CONSIDERED TO BE RELEVANT

Category * Citation of document, with indication, where appropriate, of the relevant passages

Relevant to claim No.

WO,A,91 10363 (CROP GENETICS
INTERNATIONAL) 25 July 1991
see the whole document

EP,A,0 474 601 (CIBA-GEIGY) 11 March 1992
see the whole document

EP,A,0 256 682 (ICI) 24 February 1988
see the whole document



□ 



Further documents are listed in the continuation of box C.



0 



Patent family members are listed in 



* Special categories of cited documents :

"A" document defining the general state of the art which is not
considered to be of particular relevance

"E" earlier document but published on or after the international
filing date

"L" document which may throw doubts on priority claim(s) or
which is cited to establish the publication date of another
citation or other special reason (as specified)

"O" document referring to an oral disclosure, use, exhibition or
other means

"P" document published prior to the international filing date but
later than the priority date claimed

"T" later document published after the international filing date
or priority date and not in conflict with the application but
cited to understand the principle or theory underlying the
invention

"X" document of particular relevance; the claimed invention
cannot be considered novel or cannot be considered to
involve an inventive step when the document is taken alone

"Y" document of particular relevance; the claimed invention
cannot be considered to involve an inventive step when the
document is combined with one or more other such documents,
such combination being obvious to a person skilled
in the art.

"&" document member of the same patent family



Date of the actual completion of the international search 



14 April 1994 



Date of mailing of the international search report



2 7 -Oif 1994 



Name and mailing address of the ISA 

European Patent Office, P.B. 5818 Patentlaan 2
NL - 2280 HV Rijswijk
Tel. (+31-70) 340-2040, Tx. 31 651 epo nl,
Fax: (+31-70) 340-3016



Authorized officer



Maddox, A 



Form PCT/ISA/210 (second sheet) (July 1992)



INTERNATIONAL SEARCH REPORT 

Information on patent family members

International Application No

PCT/GB 94/00012

Patent document          Publication   Patent family        Publication
cited in search report   date          member(s)            date

WO-A-9110363             25-07-91      AU-A-  7159291       05-08-91

EP-A-0474601             11-03-92      AU-B-  646492        24-02-94
                                       AU-A-  8372091       12-03-92
                                       CA-A-  2050743       08-03-92

EP-A-0256682             24-02-88      AU-A-  7627787       11-02-88
                                       JP-A-  63041410      22-02-88

Form PCT/ISA/210 (patent family annex) (July 1993)



